
Balanced Quantization: An Effective and Efficient Approach to Quantized Neural Networks



Abstract

Quantized Neural Networks (QNNs), which use low-bitwidth numbers for representing parameters and performing computations, have been proposed to reduce computation complexity, storage size, and memory usage. In QNNs, parameters and activations are uniformly quantized, so that multiplications and additions can be accelerated by bitwise operations. However, distributions of parameters in Neural Networks are often imbalanced, such that a uniform quantization determined from extremal values may underutilize the available bitwidth. In this paper, we propose a novel quantization method that ensures balanced distributions of quantized values. Our method first recursively partitions the parameters by percentiles into balanced bins, and then applies uniform quantization. We also introduce computationally cheaper approximations of percentiles to reduce the computation overhead introduced. Overall, our method improves the prediction accuracy of QNNs without introducing extra computation during inference, has negligible impact on training speed, and is applicable to both Convolutional Neural Networks and Recurrent Neural Networks. Experiments on standard datasets including ImageNet and Penn Treebank confirm the effectiveness of our method. On ImageNet, the top-5 error rate of our 4-bit quantized GoogLeNet model is 12.7%, which is superior to the state of the art for QNNs.
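The core idea in the abstract, partitioning parameters into equally populated bins by percentiles and then assigning uniformly spaced quantized values, can be illustrated with a minimal sketch. This is a hypothetical simplification written for this page, not the paper's implementation (it ignores the recursive partitioning and the cheaper percentile approximations the authors describe); the function name and the [-1, 1] output range are assumptions.

```python
import numpy as np

def balanced_quantize(params, bits=2):
    """Hypothetical sketch of percentile-based balanced quantization:
    map each parameter to one of 2**bits uniformly spaced levels such
    that each level receives roughly the same number of parameters."""
    n_levels = 2 ** bits
    flat = params.ravel()
    # Percentile boundaries split the values into equally populated bins,
    # unlike min/max-based uniform quantization, which can leave most
    # values crowded into a few bins when the distribution is imbalanced.
    edges = np.percentile(flat, np.linspace(0, 100, n_levels + 1))
    # Assign each value a bin index in 0 .. n_levels - 1.
    idx = np.searchsorted(edges, flat, side="right") - 1
    idx = np.clip(idx, 0, n_levels - 1)
    # Map bin indices to uniformly spaced quantized values in [-1, 1],
    # so downstream multiplications stay bitwise-friendly.
    levels = np.linspace(-1.0, 1.0, n_levels)
    return levels[idx].reshape(params.shape)
```

Because the bin edges are percentiles, even a heavily skewed weight distribution ends up using all quantization levels about equally, which is the balance property the abstract refers to.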
